lobSTR: A Short Tandem Repeat Profiler for Personal Genomes

Authors

  • Melissa Gymrek
  • David Golan
  • Saharon Rosset
  • Yaniv Erlich
Abstract

Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR's accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR's implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format.
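
One of the validation steps named in the abstract, tracing Mendelian inheritance of STR alleles in a trio, can be illustrated with a minimal sketch. This is not lobSTR's code or its statistical model. The function name, the genotype representation (a pair of repeat counts), and the example values are all hypothetical. The check assumes an autosomal locus and ignores de novo mutations and genotyping error.

```python
from typing import Tuple

# Hypothetical representation: a genotype is the pair of STR allele lengths
# (in repeat units) carried on the two chromosome copies.
Genotype = Tuple[int, int]


def is_mendelian_consistent(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """Return True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


# Illustrative (made-up) genotypes at a single autosomal STR locus.
print(is_mendelian_consistent((12, 14), (12, 13), (14, 15)))  # True
print(is_mendelian_consistent((11, 14), (12, 13), (14, 15)))  # False
```

Under these assumptions, the fraction of loci that pass such a check across a trio gives a rough consistency measure. A failing locus may reflect a genotyping error or a genuine new mutation.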

Similar resources

Genetic Variation of Informative Short Tandem Repeat (STR) Loci in an Iranian Population

In the present study, genotyping of six short tandem repeat (STR) loci including CSF1PO, D16S539, F13A01, F13B, LPL and HPRTB was performed on genomic DNA from 127 unrelated individuals from the Iranian province of Isfahan. The results indicated that the allele and genotype distributions were in accordance with Hardy-Weinberg expectations. The observed heterozygosity (Ho), expected heterozygosi...

DNA typing from skeletal remains: evaluation of multiplex and megaplex STR systems on DNA isolated from bone and teeth samples.

AIM To evaluate the performance of three multiplex short tandem repeat (STR) systems (AmpflSTR Profiler, AmpflSTR Profiler Plus, and AmpflSTR COfiler), and a megaplex STR system (PowerPlex 16) on DNA extracted from the skeletal remains. By performing a microbial DNA challenge study, we also evaluated the influence of microbial DNA on human DNA typing. METHODS A subset of 86 DNA extracts isola...

Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing

MOTIVATION Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2-6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating STRs in short reads remains largely unexplored because of the difficulty in elucidating STRs much l...

Identifying personal genomes by surname inference.

Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate...

Journal title:
  • Genome Research

Volume 22, Issue 6

Pages: -

Publication date: 2012